Sliced Inverse Regression with Variable Selection and Interaction Detection
Abstract
Variable selection methods play an important role in modeling high-dimensional data and are key to data-driven scientific discovery. In this paper, we consider the problem of variable selection with interaction detection under the sliced inverse index modeling framework, in which the response is influenced by predictors through an unknown function of both linear combinations of predictors and interactions among them. Instead of building a predictive model of the response given combinations of predictors, we start by modeling the conditional distribution of predictors given responses. This inverse modeling perspective motivates us to propose a stepwise procedure based on likelihood-ratio tests that is effective and computationally efficient in detecting interactions, with few assumptions about their parametric form. The proposed procedure is able to detect pairwise interactions among p predictors with a computational time of O(p) instead of O(p^2) under moderate conditions. Consistency of the procedure in variable selection under a diverging number of predictors and sample size is established. Its excellent empirical performance in comparison with some existing methods is demonstrated through simulation studies as well as real data examples.
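For readers unfamiliar with the inverse modeling idea, the following is a minimal sketch of classical sliced inverse regression (Li, 1991), not the stepwise likelihood-ratio procedure proposed in this paper. It illustrates how information about the conditional distribution of the predictors given the response (here, slice means) can recover an index direction. The function name `sir_directions` and its parameters are hypothetical, and the sketch assumes more observations than predictors and at least two predictors.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_directions=1):
    """Illustrative classical SIR (Li, 1991); hypothetical helper, not the paper's method."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Standardize predictors: Z = (X - mean) @ Sigma^{-1/2}
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - mu) @ inv_sqrt

    # Partition observations into slices of roughly equal size by sorted response
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance matrix of the slice means of Z
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)

    # Leading eigenvectors of M, mapped back to the original predictor scale
    _, vecs = np.linalg.eigh(M)            # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]  # take the largest ones first
    B = inv_sqrt @ top
    return B / np.linalg.norm(B, axis=0)   # unit-length columns
```

Each column of the returned matrix is an estimated direction, identified only up to sign. Because classical SIR relies on slice means, it can miss effects that are symmetric in the predictors, such as pure interaction terms, which is one reason interaction detection calls for additional machinery.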
Similar resources
Sliced inverse regression for high-dimensional time series
Dimension reduction methods are very helpful, and almost a necessity, for analyzing high-dimensional time series, since otherwise modelling requires many parameters because of interactions at various time lags. We use a dynamic version of Sliced Inverse Regression (SIR; Li (1991)), which was developed to reduce the dimension of the regressor in regression problems, as an exploratory tool...
Variable Selection for General Index Models via Sliced Inverse Regression
Variable selection, also known as feature selection in machine learning, plays an important role in modeling high-dimensional data and is key to data-driven scientific discoveries. We consider here the problem of detecting influential variables under the general index model, in which the response depends on the predictors through an unknown function of one or more linear combinations of them. ...
Forward Selection and Estimation in High Dimensional Single Index Models
We propose a new variable selection and estimation technique for high dimensional single index models with unknown monotone smooth link function. Among many predictors, typically, only a small fraction of them have significant impact on prediction. In such a situation, more interpretable models with better prediction accuracy can be obtained by variable selection. In this article, we propose a ...
Boiling Points Predictions Study via Dimension Reduction Methods: SIR, PCR and PLSR
Variable selection is an important tool in QSAR. In this article, we employ three known techniques: sliced inverse regression (SIR), principal components regression (PCR) and partial least squares regression (PLSR) for models to predict the boiling points of 530 saturated hydrocarbons. With 122 topological indices as input variables our results show that these three methods have good performanc...
Journal de la Société Française de Statistique Comparison of sliced inverse regression approaches for underdetermined cases
Among methods to analyze high-dimensional data, the sliced inverse regression (SIR) is of particular interest for non-linear relations between the dependent variable and some indices of the covariate. When the dimension of the covariate is greater than the number of observations, classical versions of SIR cannot be applied. Various upgrades were then proposed to tackle this issue such as RSIR a...
Publication date: 2013